[Release SM 6.9] Cherry-Pick Fix rawBufferVectorLoad/Store to widen min precision types to 32-bit #8369

Merged
alsepkow merged 2 commits into microsoft:release-1.9.2602 from alsepkow:user/alsepkow/cherry-pick-dc4354b
Apr 15, 2026
Conversation

@alsepkow
Contributor

Cherry-pick PR (#8274) and revert of out-of-scope changes PR (#8321)

Assisted by gh copilot.

SHA dc4354b
SHA 71aa195

alsepkow and others added 2 commits April 13, 2026 20:29
…icrosoft#8274)

## Summary

Fixes `RawBufferVectorLoad`/`Store` to use 32-bit element types
(`i32`/`f32`) for min precision types (`min16int`, `min16uint`,
`min16float`) instead of 16-bit (`i16`/`f16`). This matches how
pre-SM6.9 `RawBufferLoad` handles min precision.

Resolves microsoft#8273

## Root Cause

`TranslateBufLoad` in `HLOperationLower.cpp` creates the vector type
directly from the min precision element type (`i16`/`f16`) without
widening to `i32`/`f32`. This causes WARP (and potentially other
drivers) to load/store 2 bytes per element instead of 4, mismatching the
buffer layout.
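
The stride mismatch described above can be illustrated with a small host-side sketch. This is not DXC or driver code: `packAs32Bit`, `readWith16BitStride`, and `readWith32BitStride` are hypothetical helpers that only model a buffer holding one 4-byte slot per min precision element, read back with either a 2-byte or a 4-byte stride.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: lay out values with one 4-byte slot per element,
// which is the layout the 32-bit path expects.
std::vector<std::uint8_t> packAs32Bit(const std::vector<std::int16_t> &values) {
  std::vector<std::uint8_t> bytes(values.size() * sizeof(std::int32_t));
  for (std::size_t i = 0; i < values.size(); ++i) {
    std::int32_t wide = values[i]; // widen to 32 bits before writing
    std::memcpy(&bytes[i * sizeof(wide)], &wide, sizeof(wide));
  }
  return bytes;
}

// Models the bug: element `index` is fetched with a 2-byte stride.
std::int16_t readWith16BitStride(const std::vector<std::uint8_t> &bytes,
                                 std::size_t index) {
  assert((index + 1) * sizeof(std::int16_t) <= bytes.size());
  std::int16_t narrow;
  std::memcpy(&narrow, &bytes[index * sizeof(narrow)], sizeof(narrow));
  return narrow;
}

// Models the fix: fetch with a 4-byte stride, then narrow to 16 bits.
std::int16_t readWith32BitStride(const std::vector<std::uint8_t> &bytes,
                                 std::size_t index) {
  assert((index + 1) * sizeof(std::int32_t) <= bytes.size());
  std::int32_t wide;
  std::memcpy(&wide, &bytes[index * sizeof(wide)], sizeof(wide));
  return static_cast<std::int16_t>(wide);
}
```

With the values `{1, 2, 3, 4}`, the 4-byte-stride read of element 1 returns 2, while the 2-byte-stride read of element 1 lands inside element 0's slot and returns a different value.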

## Fix

Apply the same widening pattern used for bool types:
- **Load**: Load as `v_i32`/`v_f32`, then `trunc`/`fptrunc` back to
`i16`/`half`
- **Store**: `sext`/`fpext` to `i32`/`f32`, then store as
`v_i32`/`v_f32`
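
As a rough model of the integer half of this pattern, the sketch below widens each lane before a store and narrows each lane after a load. It is an illustration only, not the code in `HLOperationLower.cpp`: `kLanes`, `MinPrecVec`, `WideVec`, `widenForStore`, and `narrowAfterLoad` are hypothetical names, and the `std::array` types merely stand in for `<4 x i16>` and `<4 x i32>`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 4; // hypothetical vector width

using MinPrecVec = std::array<std::int16_t, kLanes>; // stand-in for <4 x i16>
using WideVec = std::array<std::int32_t, kLanes>;    // stand-in for <4 x i32>

// Store side: sign-extend every lane to 32 bits (the `sext` step),
// so the value written to the buffer occupies 4 bytes per element.
WideVec widenForStore(const MinPrecVec &v) {
  WideVec out{};
  for (std::size_t i = 0; i < kLanes; ++i)
    out[i] = static_cast<std::int32_t>(v[i]);
  return out;
}

// Load side: the buffer yields 32-bit lanes; narrow each one back to
// 16 bits (the `trunc` step). Inputs here are assumed to fit in 16 bits.
MinPrecVec narrowAfterLoad(const WideVec &v) {
  MinPrecVec out{};
  for (std::size_t i = 0; i < kLanes; ++i)
    out[i] = static_cast<std::int16_t>(v[i]);
  return out;
}
```

Any vector of in-range 16-bit values survives a `widenForStore` then `narrowAfterLoad` round trip unchanged, which is the property the load and store sides need to share.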

## Testing

Added FileCheck test verifying all 3 min precision types produce
`i32`/`f32` vector load/store ops.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Tex Riddell <texr@microsoft.com>
…icrosoft#8321)

## Summary

Reverts the `DxilGenerationPass` ByteAddressBuffer scalar store changes
and removes scalar store tests that were included in PR microsoft#8274. These
changes were out of scope for the vector load/store fix and, on further
discussion, were concluded to be incomplete.

Follow-up issue for the proper fix: microsoft#8322

## What changed

- **DxilGenerationPass.cpp**: Reverted the ByteAddressBuffer fallback
path added in `ReplaceMinPrecisionRawBufferStoreByType`. The `sext`
fallback was wrong for `min16uint` (sign extension treats an unsigned
value as signed), and the fix belongs in CodeGen where signedness info
is still available.
- **min_precision_raw_load_store.hlsl**: Removed scalar load/store
tests. Scalar `ByteAddressBuffer::Store<min16int>()` hits a pre-existing
crash in `TranslateMinPrecisionRawBuffer` (`cast<StructType>` on
ByteAddressBuffer's `i32` inner element). Test now covers vector
loads/stores only, which is the scope of the original fix.
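
To see why a `sext` fallback is unsafe for `min16uint`, the following sketch compares sign extension and zero extension of the same 16-bit pattern. `signExtend16` and `zeroExtend16` are hypothetical helpers written for illustration; they model the `sext` and `zext` operations and are not taken from DXC.

```cpp
#include <cstdint>

// Models `sext i16 -> i32`: copy bit 15 into the upper 16 bits.
std::uint32_t signExtend16(std::uint16_t bits) {
  return (bits & 0x8000u) ? (0xFFFF0000u | bits)
                          : static_cast<std::uint32_t>(bits);
}

// Models `zext i16 -> i32`: the upper 16 bits are always zero.
std::uint32_t zeroExtend16(std::uint16_t bits) {
  return static_cast<std::uint32_t>(bits);
}
```

For patterns with bit 15 clear the two agree, but for `0xFFFF` sign extension yields `0xFFFFFFFF` while zero extension yields 65535. Without signedness information, a pass cannot tell which of the two an unsigned 16-bit value needs.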

## Context

The original PR microsoft#8274 correctly fixed `RawBufferVectorLoad/Store` to
widen min precision types to 32-bit. However, it also added a
ByteAddressBuffer scalar store fix in `DxilGenerationPass` that:
1. Crept outside the scope of the vector load/store fix
2. Was incomplete — the `sext` fallback is wrong for unsigned types
(`min16uint`)
3. Should instead be handled during Clang CodeGen, where signedness
information is available

Scalar ByteAddressBuffer template store widening for min precision types
is a separate pre-existing issue that needs a proper fix in CodeGen.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@azure-pipelines

Azure Pipelines:
Successfully started running 1 pipeline(s).

Collaborator

@Icohedron Icohedron left a comment

Confirmed to be cherry-picks; the diff hashes match.

@alsepkow alsepkow merged commit 9b2f583 into microsoft:release-1.9.2602 Apr 15, 2026
13 checks passed
@github-project-automation github-project-automation bot moved this from New to Done in HLSL Roadmap Apr 15, 2026
alsepkow added a commit that referenced this pull request Apr 15, 2026
…recision vector element access (#8370)

Cherry-pick PR (#8269)

Assisted by gh copilot.

Depends on: #8369

SHA
[ccba9f8](ccba9f8)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
alsepkow added a commit that referenced this pull request Apr 15, 2026
…cases to the long vector test (#8371)

Cherry-pick PR (#8260)

Assisted by gh copilot.

Depends on: #8369, #8370

SHA
[3531468](3531468)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
alsepkow added a commit that referenced this pull request Apr 15, 2026
… feature reference (#8372)

Cherry-pick PR (#8353)

Assisted by gh copilot.

Depends on: #8369, #8370, #8371

SHA
[f86cda3](f86cda3)

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>